Search CORE

116 research outputs found

Analysis of Multicore CPU and GPU toward Parallelization of Total Focusing Method ultrasound reconstruction

Author: Bimbard Franck
Gens Guillaume
Iakovleva Ekaterina
Lacassagne Lionel
Lambert Jason
Pédron Antoine
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/10/2012

Ultrasonic imaging and reconstruction tools are commonly used to detect, identify and measure defects in different mechanical parts. Due to the complexity of the underlying physics and the ever-growing quantity of acquired data, computation time is becoming a limitation to the optimal inspection of a mechanical part. This article presents the performance of several implementations of a computationally heavy algorithm, the Total Focusing Method, on both Graphics Processing Units (GPU) and General Purpose Processors (GPP). The scope of this study is narrowed to planar parts tested in immersion for defects. Using algorithmic simplifications and architectural optimizations, the algorithm has been drastically accelerated, resulting in memory-bound implementations. On GPU, high performance can be achieved by profiting from the GPU's long memory transactions and from hand-managed memory, whereas on GPP, computation costs are overshadowed by memory accesses, resulting in lower efficiency relative to the available computing capabilities. This study constitutes the first step toward analyzing the target algorithm for diverse hardware in the non-destructive testing environment.
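As a rough illustration of why the Total Focusing Method is costly, the sketch below implements a plain delay-and-sum over every transmit/receive pair for every pixel. It is not the authors' implementation: it assumes a single medium with a straight-ray path (the article targets immersion testing, which adds a refracting interface), and the names `tfm_image`, `synth_fmc`, the array geometry and the sampling values are all hypothetical.

```python
import math

def tfm_image(fmc, elem_x, pixels, c, fs):
    """Delay-and-sum over all transmit/receive pairs for each pixel.

    fmc[tx][rx] is a list of samples, elem_x holds element x-positions on
    the line z = 0, pixels is a list of (x, z) points, c is the sound speed
    (m/s) and fs the sampling rate (Hz). Single-medium assumption.
    """
    image = []
    for px, pz in pixels:
        # One-way distance from every element to this pixel.
        dist = [math.hypot(px - ex, pz) for ex in elem_x]
        acc = 0.0
        for tx, d_tx in enumerate(dist):
            for rx, d_rx in enumerate(dist):
                # Round-trip time of flight converted to a sample index.
                k = round((d_tx + d_rx) / c * fs)
                signal = fmc[tx][rx]
                if k < len(signal):
                    acc += signal[k]
        image.append(abs(acc))
    return image

def synth_fmc(elem_x, scatterer, c, fs, n_samples):
    """Synthetic full-matrix capture: one unit spike per pair for a point scatterer."""
    sx, sz = scatterer
    dist = [math.hypot(sx - ex, sz) for ex in elem_x]
    fmc = [[[0.0] * n_samples for _ in elem_x] for _ in elem_x]
    for tx, d_tx in enumerate(dist):
        for rx, d_rx in enumerate(dist):
            fmc[tx][rx][round((d_tx + d_rx) / c * fs)] = 1.0
    return fmc

# Hypothetical setup: 8 elements at 1 mm pitch, water-like sound speed.
elem_x = [i * 1e-3 for i in range(8)]
c, fs = 1500.0, 50e6
pixels = [(x * 1e-3, z * 1e-3) for z in range(5, 16) for x in range(8)]
scatterer = (3 * 1e-3, 10 * 1e-3)  # placed exactly on a grid pixel

fmc = synth_fmc(elem_x, scatterer, c, fs, n_samples=2048)
image = tfm_image(fmc, elem_x, pixels, c, fs)
brightest = pixels[image.index(max(image))]
```

Each pixel reads one sample from every transmit/receive signal and does very little arithmetic on it, which is consistent with the memory-bound behaviour the abstract reports.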

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-CEA

HAL-Rennes 1

Generic programming methods for the real time implementation of a MRF based motion detection algorithm on a multi-processor DSP with multidimensional DMA

Author: GARDA Patrick
LACASSAGNE Lionel
LOHIER Frantz
Publication venue: GRETSI, Groupe d’Etudes du Traitement du Signal et des Images
Publication date: 01/01/1999

This paper addresses two issues. First, we highlight the need for generic programming methods for the real-time (RT) implementation of complex low-level image processing algorithms on parallel DSP architectures built from multiprocessors that exploit instruction-level parallelism and multidimensional DMAs. Second, we introduce the need for an RT implementation of a motion detection algorithm on architectures compatible with low-cost embedded systems. To meet these needs, we show how a DMA-based methodology for managing synchronous data flows, designed to be dynamic and generic with respect to processing configurations (depending on the nature of the processing chains, the image size and the number of processors involved), can be used to implement a Markovian motion detection method on the advanced parallel architecture of the TMS320C80. This case study demonstrates the suitability of our method and yields a speedup factor of 4 over previously published processing times for the target algorithm. Furthermore, we estimate that RT processing is possible on 256² images with an optimal C80 system.

I-Revues

Suivi temps-réel (matrices de covariance couleur-texture et commutation automatique de descripteur/opérateur)

Author: LACASSAGNE Lionel.
ROMERO MIER Y TERAN Andrés
Publication date: 01/01/2013

These technologies have led researchers to imagine automating and emulating the visual perception abilities of animals and of humans themselves. For several decades the field of computer vision has tried several approaches, and a wide range of applications has been developed with partial success: content-based image retrieval, data mining from video sequences, object re-identification by robots, etc. Some applications are already on the market and enjoy a certain commercial success. Visual recognition is a problem closely tied to learning visual categories from a limited set of instances. Two approaches are typically used to solve this problem: learning generic categories and re-identifying instances of a particular object. The latter case involves recognizing a particular object or person. Generic recognition, on the other hand, involves finding all instances of objects that belong to the same conceptual category: all cars, pedestrians, birds, etc. This thesis proposes a computer vision system able to detect and track multiple objects in video sequences. The proposed correspondence search algorithm is based on covariance matrices obtained from a set of image properties (mainly color and texture). Its main advantage is that it uses a descriptor that allows very heterogeneous sources of information to be introduced to represent the targets. This representation is effective for object tracking and re-identification. This thesis introduces four contributions. First, it examines the invariance of tracking algorithms to changes in context.
We propose a methodology for measuring the importance of color information as a function of its illumination and saturation levels. A second part is then devoted to the study of different tracking methods and their advantages and limitations depending on the type of object to track (rigid or non-rigid, for example) and on the context (static or moving camera). The method we propose adapts automatically and uses a switching mechanism between different tracking methods that takes their complementary strengths into account. Our algorithm is based on a covariance model that fuses color-texture information and on optical flow (KLT) modified to make it more robust and adaptable to illumination changes. A second approach relies on the analysis of different color spaces and invariants in order to obtain a descriptor that strikes a good balance between discriminative power and robustness to illumination changes. A third contribution concerns the multi-target tracking problem, in which several difficulties arise: identity confusion, occlusions, merging and splitting of trajectories and detections, etc. The last part is devoted to algorithm speed, in order to provide a fast solution usable in embedded applications. This thesis proposes a series of optimizations to accelerate matching with covariance matrices.
Data layout transformations, vectorization of the computations (using SIMD instructions) and certain loop transformations enable real-time execution of the algorithm not only on large conventional Intel processors but also on embedded platforms (ARM Cortex A9 and Intel U9300).
Visual recognition is the problem of learning visual categories from a limited set of samples and identifying new instances of those categories. The problem is often separated into two types: the specific case and the generic category case. In the specific case, the objective is to identify instances of a particular object, place or person, whereas in the generic category case we seek to recognize different instances that belong to the same conceptual class: cars, pedestrians, road signs and mugs. Specific object recognition works by matching and geometric verification. In contrast, generic object categorization often includes a statistical model of appearance and/or shape. This thesis proposes a computer vision system for detecting and tracking multiple targets in videos. A preliminary part of this thesis consists of adapting color according to lighting variations and the relevance of the color. The literature shows a wide variety of tracking methods, each with advantages and limitations depending on the object to track and the context. Here, a deterministic method is developed to automatically adapt the tracking method to the context through the cooperation of two complementary techniques. A first proposal combines covariance matching, for modeling texture-color information, with optical flow (KLT) of a set of points uniformly distributed on the object. A second technique associates covariance and Mean-Shift.
In both cases, the cooperation provides robust tracking whatever the nature of the target, while reducing overall execution time. The second contribution is the definition of descriptors that are both discriminative and compact, to be included in the target representation. To improve the visual recognition ability of descriptors, two approaches are proposed. The first is an adaptation of LBP (Local Binary Patterns) operators for inclusion in the covariance matrices. This method is called ELBCM, for Enhanced Local Binary Covariance Matrices. The second approach is based on the analysis of different color spaces and invariants to obtain a descriptor that is discriminative and robust to illumination changes. The third contribution addresses the problem of multi-target tracking, whose difficulties include matching ambiguities, occlusions, and the merging and division of trajectories. Finally, to speed up the algorithms and provide a fast solution usable in embedded applications, this thesis proposes a series of optimizations to accelerate matching using covariance matrices. Data layout transformations, vectorization of the calculations (using SIMD instructions) and some loop transformations made real-time execution of the algorithm possible not only on classic Intel processors but also on embedded platforms (ARM Cortex A9 and Intel U9300).
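The covariance descriptor mentioned in this abstract can be sketched as the sample covariance matrix of per-pixel feature vectors over a region. The snippet below is only an illustration under assumed inputs: the function name `region_covariance`, the three features (red, green, gradient magnitude) and the sample values are hypothetical, and the thesis's actual feature set and matching metric are not reproduced here.

```python
def region_covariance(features):
    """Sample covariance matrix of per-pixel feature vectors.

    features is a list of equal-length lists, one vector per pixel.
    """
    n = len(features)
    d = len(features[0])
    mean = [sum(f[i] for f in features) / n for i in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j])
    # Unbiased estimate: divide by n - 1.
    return [[v / (n - 1) for v in row] for row in cov]

# Hypothetical 4-pixel region; each pixel -> [red, green, gradient magnitude].
region = [
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 0.0],
    [4.0, 8.0, 1.0],
]
C = region_covariance(region)
```

The matrix size depends only on the number of features, not on the region size, which is one reason heterogeneous cues such as color and texture can share a single compact descriptor.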

OpenGrey Repository

Parallelization of a new embedded application for automatic meteor detection

Author: Cassagne Adrien
Ciocan Clara
Kandeepan Mathuran
Lacassagne Lionel
Publication date: 20/07/2023

This article presents the methods used to parallelize a new computer vision application. The system is able to automatically detect meteors from non-stabilized cameras and noisy video sequences. The application is designed to be embedded in weather balloons or for airborne observation campaigns. Thus, the final target is a low-power system-on-chip (< 10 Watts), while the software needs to process a stream of frames in real time (> 25 frames per second). To this end, the application is first split into a task graph, then different parallelization techniques are applied. Experimental results demonstrate the efficiency of the parallelization methods. For instance, on the Raspberry Pi 4 and on an HD video sequence, the processing chain reaches 42 frames per second while consuming only 6 Watts.
Comment: in French language, COMPAS 2023 - Conférence francophone d'informatique en Parallélisme, Architecture et Système, Jul 2023, Annecy (France), Franc
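The figures quoted in this abstract translate into a per-frame time budget and an energy cost per frame. The short calculation below works them out, assuming the 6 W figure is the average power drawn while processing at 42 frames per second; the variable names are illustrative only.

```python
fps_required = 25    # real-time threshold quoted in the abstract
fps_measured = 42    # reported on the Raspberry Pi 4 for an HD sequence
power_watts = 6.0    # reported consumption, assumed to be an average

budget_ms = 1000.0 / fps_required               # time allowed per frame
frame_ms = 1000.0 / fps_measured                # time actually spent per frame
energy_per_frame_j = power_watts / fps_measured # joules per processed frame

print(f"budget per frame: {budget_ms:.1f} ms")
print(f"measured per frame: {frame_ms:.1f} ms")
print(f"energy per frame: {energy_per_frame_j:.3f} J")
```

Under that assumption each frame takes about 23.8 ms against a 40 ms budget and costs roughly 0.14 J.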

arXiv.org e-Print Archive

Implémentation temps réel d'algorithme de détection de mouvement par champs de markov sur RISC et DSP C6x

Author: GARDA Patrick
LACASSAGNE Lionel
LOHIER Frantz
MILGRAM Maurice
Publication venue: GRETSI, Groupe d’Etudes du Traitement du Signal et des Images
Publication date: 01/01/1999

This article describes a real-time implementation of a motion detection algorithm based on Markov random fields. It also describes the architectural optimizations applied to RISC processors and to the C6x DSP that make real-time operation achievable.

I-Revues

A New Real-Time Embedded Video Denoising Algorithm

Author: Bouyer Manuel
Gaillard Boris
Lacassagne Lionel
Lemaitre Florian
Masliah Ian
Meunier Quentin
PETRETO Andrea
Romera Thomas
Publication venue: HAL CCSD
Publication date: 16/10/2019

Many embedded applications rely on video processing or on video visualization. Noisy video is thus a major issue for such applications. However, video denoising requires a lot of computational effort and most of the state-of-the-art algorithms cannot be run in real-time at camera framerate. This article introduces a new real-time video denoising algorithm for embedded platforms called RTE-VD. We first compare its denoising capabilities with other online and offline algorithms. We show that RTE-VD can achieve real-time performance (25 frames per second) for qHD video (960×540 pixels) on embedded CPUs and the output image quality is comparable to state-of-the-art algorithms. In order to reach real-time denoising, we applied several high-level transforms and optimizations (SIMDization, multi-core parallelization, operator fusion and pipelining). We study the relation between computation time and power consumption on several embedded CPUs and show that it is possible to determine different frequency and core configurations in order to minimize either the computation time or the energy.

Crossref

Real-time embedded video denoiser prototype

Author: Bouyer Manuel
Gaillard Boris
Lacassagne Lionel
Lemaitre Florian
Menard Patrice
Meunier Quentin
Petreto Andrea
Romera Thomas
Publication venue: HAL CCSD
Publication date: 28/01/2020

Low light or other poor visibility conditions often generate noise on any vision system. However, video denoising requires a lot of computational effort and most of the state-of-the-art algorithms cannot be run in real-time at camera framerate. Noisy video is thus a major issue, especially for embedded systems that provide low computational power. This article presents a new real-time video denoising algorithm for embedded platforms called RTE-VD [1]. We first compare its denoising capabilities with other online and offline algorithms. We show that RTE-VD can achieve real-time performance (25 frames per second) for qHD video (960×540 pixels) on embedded CPUs with an output image quality comparable to state-of-the-art algorithms. In order to reach real-time denoising, we applied several high-level transforms and optimizations. We study the relation between computation time and power consumption on several embedded CPUs and show that it is possible to determine frequency and core configurations that minimize either the computation time or the energy. Finally, we introduce VIRTANS, our embedded real-time video denoiser based on RTE-VD.

Extensions SIMD des jeux d'instructions

Author: Etiemble Daniel
Lacassagne Lionel
Publication venue: Techniques de l'ingénieur
Publication date: 10/02/2015

This article describes the SIMD extensions of microprocessor instruction sets. The various SSE and AVX extensions of IA-32 and Intel 64 (Intel), ARM's Neon extensions, and IBM's variants (Altivec) are used as examples. The article presents the specifics of integer arithmetic, the handling of conditional structures, and memory accesses. It shows how the extensions comprise natural extensions of scalar instructions together with ad hoc instructions aimed at particular applications. These instructions are used either by helping the compiler to "vectorize" or through intrinsics, which are function calls corresponding to the instructions to insert into a C or C++ program.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

A new Direct Connected Component Labeling and Analysis Algorithm for GPUs

Author: Hennequin Arthur
Lacassagne Lionel
Publication venue: HAL CCSD
Publication date: 17/03/2019

International audienc